ABSTRACT

Hypothesis generation has been shown to be a crucial phase of
clinical inquiry. The current instruments for measuring clinical
problem-solving skills, however, are unable to differentially
assess hypothesis-generating ability. A new test for assessing
this particular capability is described. It is based upon exposing
the examinee to an unrealistic, hypothetical, and thus unfamiliar
context. A wide range of alternative data are presented, from
which the examinee is required to choose those which fit his or
her hypothesis, avoiding internal inconsistencies. Construct
validation, both discriminant and convergent, is presented,
demonstrating that the test is independent of the depth of
knowledge of the content areas from which it is derived, while at
the same time achieving significant correlation with the scores on
patient management problems (PMPs). This latter correlation
increases as the PMP diverges further from recognizable reality.
Some variations of the "unrealistic simulation approach" are
proposed. These may correspond to the various stages of medical
education. It is suggested that this test be used as a supplement
to the PMPs.

Clinical simulations have recently become a rather frequently
used tool for both instruction and evaluation in medical
education. McGuire et al (1976), pioneers in the field, define a
simulation as a reflection of reality reduced to its essence,
in which the learner (or the examinee) is confronted with a
problematic situation and is required to embark upon a series of
inquiries, decisions and actions.

This 'realistic' technique has both strengths and weaknesses,
which derive from the fact that it is designed to approximate a
given reality. The advantages of using it for evaluation have been
extensively described elsewhere and include: perceived relevance;
standardization of the task; a wide range of sampling of
competencies; objective ratings; and fast feedback (McGuire et al,
1976; Neufeld, 1977). Even more significant is some evidence of
its criterion validity (McGuire & Babbott, 1967), although this
issue is still debatable (Goran et al, 1973). The disadvantages
which have been described include a difficulty in simulating some
aspects of reality, and an incomplete measurement of some
competencies such as, for example, factual knowledge (McGuire et
al, 1976). It is suggested that the simulation technique may have
two additional limitations, both stemming from the concept of
reflecting reality: a limited or impossible utilization of the
instrument in the early phases of medical education, and
confounding of the mental processes involved in problem-solving.

In order to measure performance in a simulated reality, the
learner first has to acquire a good grasp of that reality. Posing
a realistic clinical problem to a freshman student within the
framework of a traditional curriculum will be either highly
irrelevant, or unrealistic, or both. Thus the use of the
technique is confined to the later phases of medical education.
However, the acquisition of problem-solving skills in the early,
formative years is regarded as of the utmost importance (Dewey,
1916; Neufeld, 1977) and some schools have adopted an
interdisciplinary integrated problem-solving approach from the
commencement of studies (Neufeld, 1977; Bouhujis et al, 1978;
Benor et al, 1979).



The second limitation of the simulation technique, also
stemming from its realism, is of greater concern. The mental
processes involved in problem-solving have recently been
illuminated through extensive research. Guilford and Hoepfner
(1971) suggested a four-stage process including: memory
operations; divergent production; cognition; and evaluation
operations. These roughly correspond to the findings of Elstein
and his collaborators (1978), who defined the four stages of
clinical inquiry in terms of: cue acquisition; hypothesis
generation; cue interpretation; and hypothesis evaluation. Culter
(1979) has recently described a variety of strategies used in the
process of problem-solving. However, there is a uni-phasic
'shortcut', termed by the last two authors pattern recognition or
pattern matching, which is widely used by practitioners on
numerous occasions. This more economical heuristic process is a
recognition of a pattern, syndrome or cluster of cues, which gives
rise to an almost reflexive response.



Contemporary medical education aims at developing the
clinical inquiry approach (Elstein et al, 1978), which is
systematic and analytic in nature. Heuristic 'jumps', such as
pattern recognition, are permitted insofar as they are later
analytically evaluated; cue recognition should be preceded by
active cue acquisition, and supplemented by cue interpretation.
Herein lies the difference between the apprenticeship approach of
the old days, aimed at increasing the pattern repertoire of the
learner in order to enable acquisition of readily recognized sets
and reflexive responses, and the post-Flexnerian approach. The
present communication suggests that the realistic simulation
technique cannot differentiate between the analytic and the
heuristic modes of thinking. It is further suggested that a
differentiation is needed both for educational planning and for
the diagnostic purpose of identifying students who require
remedial intervention. It is particularly required in the early
phases of education, when thinking habits are internalized.

Following is a presentation of an 'unrealistic simulation'
technique designed and utilized especially for the evaluation of
the hypothesis generation stage of the problem-solving process.
The way it deals with the issue of lack of relevance is discussed
later. The instrument has been implemented for the last five
years in the Faculty of Health Sciences, Ben-Gurion University of
the Negev, Israel (BGU), as a sub-test of the summative
examination taken by first year students in a six-year curriculum.

BACKGROUND 

The instrument presented here, entitled the "Hypothetic Organism
Test" (HOT) and nicknamed "The Monster", should be viewed against
the background of the first year curriculum. A detailed
description of the BGU curriculum is available elsewhere (Segall
et al, 1978), as is an account of its integrative nature (Benor et
al, 1979). Therefore only the content area related to the test
will be briefly sketched. However, the objectives of the test
reach beyond its actual contents, and are pertinent both to the
other constituents of the concurrent integrative curriculum and to
later phases of the course.



The science component of the first year program is presented
in an integrative format along organ system lines. The multiple
solutions found in nature to problems of survival form a
background against which the human solution is considered. Human
physiology and ecology are studied within a wider biological
perspective. The concept of the basic needs of a living organism
is raised, such as nutrition, energy production and preservation,
and coordination. Pertinent zoological examples are presented in
this context. Systematic zoology is not studied, nor are
morphological details emphasized. The course is taught on a
phenomenological level, and thus stresses the observable phenomena
and the underlying principles rather than mechanisms and detailed
explanations. Appropriate components of physics, mathematics and
chemistry are tightly interwoven into the course.

The clinical component of the first year program calls for
encountering real patient problems in various clinical settings.
While the main objectives of this component are within the realm
of human interrelations (Segall et al, 1978), the student is also
expected to apply the knowledge and skills acquired in the science
courses to clinical reality as well as to public health issues.
An extensive formative evaluation scheme is conducted throughout
the year along both disciplinary and interdisciplinary lines. The
summative evaluation is based on a single examination, both
comprehensive and integrative, at the end of the year. It
comprises several subtests, some of which are case histories. Both
scientific and clinical knowledge is objectively evaluated in
conjunction with these presented cases. Another subtest is the
HOT.



THE INSTRUMENT

The examinee is required to 'construct' a hypothetical
creature that should fit given environmental conditions specified
in the introductory narrative. The environment may be either real
(e.g., desert, tropical island, marine), or imaginary (e.g., high
seas after a nuclear disaster which has changed the water's
characteristics). Only one environment is designed for each test.

A data list provides both pertinent and irrelevant information
about the environment (e.g., climate, altitude, chemicals in the
water, food supply), and about some behaviors of the creature to
be designed (e.g., "dominant", was found both in the mountain area
and on the sea shore).

Thirty 'building blocks' are presented, each formulated in a
multiple option format, from which the examinee is instructed to
select one option. Each 'block' relates to either a structure, a
substance or a process in one of the organism's body systems.
Table 1 presents some examples of 'building blocks'. Table 2
provides additional details, relinquishing the multiple option
format for convenience of presentation.

Insert Tables 1 and 2 here.

Additional blocks deal with perception and neural mechanisms,
excretion, eating, drinking and hunting behavior, regulation of
blood pressure, etc. The building blocks are presented in a
random order. Thus, for example, the eating behavior, structure of
intestine, digestive enzymes, water metabolism and tonicity of the
extracellular fluid are blocks number 19, 28, 21, 3 and 7
respectively. In addition, one open-ended question enables the
student to describe the constructed creature.



The examinee is instructed to make his or her choices in such a
way that the constructed systems, as defined by the chosen
options, would not contradict each other. Moreover, the organism
should fit the given environmental conditions. This includes, of
course, the open-ended description. The students are encouraged to
use their imaginations freely. In order to minimize the tendency
toward selecting the options characteristic of human beings, which
are best known to the students, some of the blocks do not include
the human solution (e.g., iron is not included among the options
of the respiratory pigment block). The student is thus requested
to act better than nature did, and to constitute an ideal
non-human organism. Students are, of course, unable to do this,
and are therefore permitted one contradiction. A perfect
performance is thus defined as having not more than one
contradiction.



SCORING

Flow charts are designed such as, for example, the one described
above relating to alimentation. The student's choices are checked
against lists of both non-permitted and required responses. For
example, choosing the option of "acting only in the day time"
(block 1) excludes both the option of "constant body temperature
at 38-40 degrees centigrade" (block 4), as the given environment
is cold at night, and also the option "most of the visual
receptors are rods" (block 17). But it requires one of the thermal
protection structures, such as fur, feathers or fat (block 10).
Each block can be included in more than one flow chart. Each
response may be in accord with others in a certain flow chart, yet
should not contradict a response in another chart.
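
The checking of choices against non-permitted and required responses may be sketched as follows. This is a minimal illustration in Python and not the authors' scoring procedure: the rule format, the option labels and the function name `count_contradictions` are assumptions made for the example. Only the block numbers and the day-time rule are taken from the text.

```python
# Hypothetical encoding of the single scoring rule quoted in the text.
# Block numbers (1, 4, 10, 17) follow the text; option labels are invented.
RULES = [
    {
        "if": (1, "day_only"),                   # acting only in the day time
        "excludes": [(4, "constant_38_40"),      # environment is cold at night
                     (17, "mostly_rods")],
        "requires_one_of": [(10, "fur"), (10, "feathers"), (10, "fat")],
    },
]


def count_contradictions(choices, rules=RULES):
    """Count rule violations in a dict mapping block number -> chosen option."""
    contradictions = 0
    for rule in rules:
        block, option = rule["if"]
        if choices.get(block) != option:
            continue  # the rule is not triggered by this examinee's choice
        # Each non-permitted response that was chosen counts once.
        for ex_block, ex_option in rule["excludes"]:
            if choices.get(ex_block) == ex_option:
                contradictions += 1
        # A missing required response counts once.
        required = rule["requires_one_of"]
        if required and not any(choices.get(b) == o for b, o in required):
            contradictions += 1
    return contradictions


consistent = {1: "day_only", 4: "variable", 10: "fur", 17: "mostly_cones"}
inconsistent = {1: "day_only", 4: "constant_38_40", 10: "none", 17: "mostly_rods"}
print(count_contradictions(consistent))    # 0
print(count_contradictions(inconsistent))  # 3
```

A full scoring key would hold one such rule per link in each flow chart, so that a block appearing in several charts is checked in each of them.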

The student is penalized for each contradiction. The final
negative scores are transformed to standard scores, with no more
than six contradictions, including the permitted one, being
allowed to the minimally performing student. The originality of
the student's solutions is assessed by two scorers, and these
points are added to the student's credit, contributing up to 10
per cent of the final score. A response is considered original
insofar as the solution deviates from a description of a human
being. A walking fish, if meeting the environmental conditions,
is superior to a cat.
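
The scoring constraints stated above may be summarized in a short Python sketch. The three constants come from the text; everything else is an assumption made for illustration, in particular the 0-10 originality scale and the averaging of the two scorers' ratings. The transformation to standard scores is not specified in the text and is therefore left out.

```python
PERMITTED = 1          # one contradiction is allowed (from the text)
MAX_TO_PASS = 6        # limit for the minimally performing student (from the text)
ORIGINALITY_CAP = 10   # originality contributes up to 10 per cent (from the text)


def penalized_contradictions(n_contradictions):
    """Contradictions that count against the student, beyond the permitted one."""
    return max(0, n_contradictions - PERMITTED)


def meets_minimum(n_contradictions):
    """True if the total, including the permitted one, does not exceed six."""
    return n_contradictions <= MAX_TO_PASS


def originality_points(rating_a, rating_b):
    """Assumed scheme: each scorer rates 0-10 and the two ratings are averaged."""
    mean = (rating_a + rating_b) / 2
    return min(ORIGINALITY_CAP, max(0.0, mean))


print(penalized_contradictions(4))  # 3
print(meets_minimum(7))             # False
print(originality_points(8, 6))     # 7.0
```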



VALIDATION

The face validity of the test is not at all obvious.
Although the content is directly derived from the learned subject
matter, the task is of a unique nature, never encountered before
by any examinee. Because of this uncertainty, the test was
regarded as largely experimental until validation.

The establishment of construct validity (Cronbach & Meehl,
1967) was two-fold. Discriminant validity was determined by
correlating the HOT scores with those obtained over the different
questions in the other sub-tests of the same examination, dealing
with biological and related scientific material. The results show
a low and non-significant correlation coefficient of .06 (Table
3), indicating that HOT measured a quality which is independent of
the related factual knowledge, in spite of the fact that a
considerable proportion of the knowledge questions in those other
sub-tests were on a high cognitive level.

The convergent validation of the test had to be postponed for
four years, until assessment of the clinical performance of the
first classes who took the HOT became available. It was then
possible to see its correlation with achievements on the familiar
branching patient management problems (PMPs), taken within the
framework of the obstetrics-gynecology, pediatrics and primary
care clerkships (years 4 & 5). A moderate yet significant
correlation of .26 (p < .05) was found. A higher correlation of
.37 (p < .05) was found with the PMP in the internal medicine
final examination (year 6). Moreover, an unplanned occasion
occurred, in which a PMP in the primary care clerkship was
annulled by the teachers because it dealt with a rare and
unfamiliar condition. This PMP required application of
problem-solving skills to an unrecognized, 'theoretical'
situation. The correlation with the HOT taken by the same students
four years earlier was .43 (p < .01) (Table 3). These correlations
may seem quite moderate, accounting for not more than 20 percent
of the variance in the later years. However, they indicate better
predictivity than is usually obtained by tests in medical
education. Indeed, higher correlations might call into question
any possible effect of education over the years.
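
The figure of "not more than 20 percent of the variance" follows from squaring the reported correlations. The short Python sketch below works this through; the three coefficients are those reported in the text, while the dictionary labels and the function name are chosen here for illustration.

```python
# Correlations between HOT and later PMP scores, as reported in the text.
correlations = {
    "PMPs, clerkships (years 4 & 5)": 0.26,
    "PMP, internal medicine final (year 6)": 0.37,
    "PMP, rare condition (primary care)": 0.43,
}


def shared_variance_percent(r):
    """Percentage of variance shared by two measures correlated at r."""
    return 100 * r ** 2


for label, r in correlations.items():
    print(f"{label}: r = {r:.2f}, r^2 = {shared_variance_percent(r):.1f}%")
# PMPs, clerkships (years 4 & 5): r = 0.26, r^2 = 6.8%
# PMP, internal medicine final (year 6): r = 0.37, r^2 = 13.7%
# PMP, rare condition (primary care): r = 0.43, r^2 = 18.5%
```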

The HOT is a monotrait test. Thus a construct validation by
the multitrait-multimethod matrix (Campbell & Fiske, 1959) is
impossible. However, HOT is just one subtest of seven in the
first year integrative examination, which is a multitrait test.
Similarly the PMP is but a subtest in the evaluation of students'
clinical performance in the later years. When the
multitrait-multimethod model is applied to both the early and the
clinical evaluation instruments, an additional construct
validation emerges, indicated by the 'validity diagonal' (Table 3).



Insert Table 3 about here
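
One of the Campbell and Fiske criteria is that a validity value (same trait, different methods) should exceed the other values lying in its row and column of the heteromethod block. The Python sketch below applies that check to the problem-solving trait. It is an illustration under stated assumptions: the cell placement assumes that the printed values of Table 3 are read column by column, only cells needed for this one trait are included, and the labels and function name are invented for the example.

```python
# Part of the heteromethod block as read from Table 3
# (key: clinical measure, first-year measure).
# ASSUMPTION: values are assigned to cells by reading the table column by column.
heteromethod = {
    ("PMP", "HOT"): 0.33,           # validity value, problem-solving trait
    ("MCQ_clinical", "HOT"): 0.05,  # same column, different trait
    ("RATINGS", "HOT"): -0.03,      # same column, different trait
    ("PMP", "MCQ_year1"): 0.22,     # same row, different trait
    ("PMP", "COMM"): 0.04,          # same row, different trait
}


def exceeds_row_and_column(block, row, col):
    """True if block[(row, col)] is larger than every other value
    sharing either its row or its column (but not both)."""
    validity = block[(row, col)]
    others = [v for (r, c), v in block.items() if (r == row) != (c == col)]
    return all(validity > v for v in others)


print(exceeds_row_and_column(heteromethod, "PMP", "HOT"))  # True
```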



Several additional rival hypotheses were ruled out: the score
in HOT does not correlate with the admission interview ratings,
with the subjective ratings by clinical instructors, nor with
intelligence, as measured by Raven's non-verbal intelligence test
prior to admission (Table 4).



DISCUSSION 



There is no doubt that medical practice requires appropriate
data collection, organization and interpretation. There is also a
growing acknowledgement that medicine likewise requires creative
thinking, reflected by the hypothesis generation phase of clinical
inquiry (Elstein et al, 1978; Culter, 1979). And, further, there
is considerable dissatisfaction with medical education, expressed
in several rather critical recent articles (e.g., Maddis, 1978),
in regard to the acquisition of problem-solving capabilities.
However, there is no consensus on the nature of the deficiencies
demonstrated by medical students. While some authors focus on cue
acquisition capabilities (Berber & Tremoriti, 1977), others have
found that the fault is a lack of the ability to generate
hypotheses early enough (Neufeld, 1977; Dornhorst & Hunter, 1967).
Resolution of this debate has a meaningful bearing on the planning
of new instructional experiences or changing the existing ones.
Such resolution requires measuring instruments. It is suggested
that the 'hypothetical simulation' approach presented above may
serve this end.



The main feature of the test is that there is no ultimate
truth. No hidden reality should be discovered; no actual
existence influences the flow of events. Moreover, there are no
data to be collected, drawn, accumulated or exposed; the data are
explicitly given. The student may choose any set of data to
comprise a unique universe of his own, into which the solution
should fit. The mental process required here is a partial
revision of Elstein's clinical inquiry approach. It starts with
an interpretation of the given cues on a rather low cognitive
level. Then it calls for extensive hypothesis generation while
selecting the options out of the many offered. The question that
the examinee faces is "which cues should be selected in order to
fit the hypothesis best" rather than "which hypothesis fits the
facts".



The most important advantage of the test presented here is
the preclusion of any 'pattern recognition' shortcut. Under no
circumstances can an examinee bypass the hypothesis generation
stage and evoke a reflexive response to a familiar situation. As
both the interpretation and the selection of the presented options
are relatively simple and require a low level of cognition, it is
suggested that HOT goes a long way towards focusing exclusively on
the hypothesis-generation process.

The construct validity of HOT, both convergent and
discriminant, points to some similarities with the PMP, which are
larger in the case of a previously unrecognized problem. The
results also demonstrate a fundamental divergence from the
"knowledge" component. This may shed some light on the argument
about content dependency of PMPs (Robinson & Dinham, 1977). The
possibility that content-dependency merely reflects pattern
recognition must be considered. It is demonstrated that there is
no correlation with 'knowing' the content area insofar as the
pattern cannot be recognized.



The criterion validation of HOT is beyond the scope of the
present communication, and should await additional research data.
However, it is assumed that HOT will be found to have criterion
validity as high or as low as the PMP. This question is still
debatable (Goran et al, 1973), in spite of the high face validity
of the 'realistic simulations'.



The issue of relevance vs. student motivation is also
further illuminated. Students never rejected the HOT on the
grounds of irrelevance. Their motivation level was, and still is,
high in spite (or because?) of the unrealistic situation. This
observation is in accord with Bruner's postulate (Moore & Anderson,
1969) that there is "... joy and confidence in the use of the
mind", expressed by others as an "intrinsic reward value" of
problem-solving (Barrows & Mitchell, 1975). It should be very
clearly stated that the authors do not suggest a replacement of
PMPs and other clinically relevant techniques used in evaluation,
but rather a supplement to them, without being overly concerned by
the 'reflection of reality' issue.



It is interesting to follow the creative process of the
students by monitoring their decision on the unavoidable
contradiction. There are examinees who try hard to readjust their
hypotheses over and over again in order to avoid contradiction.
There are others who deliberately introduce the contradiction
early in order to enable an easier flow thereafter. Still others
encounter the difficulty late, only to find that their entire
solution is erroneous. Some students are 'systematic thinkers'
and identify our scoring flow charts intuitively. Others are not
aware of the ties between certain blocks, scrutinizing each block
against their hypothesis instead of forming clusters of blocks to
be checked together. Although no quantitative data are available,
this observation supports a recently published assumption on the
existence of cognitive styles (Tamir et al, 1979), which were
defined as the recall, principled, questioning and application
approaches.



The HOT scoring system laid relatively considerable weight on
originality of the solution (10%). This reflects an attempt to
reward inductive thinking, on the verge of guessing. It has been
stated that guessing, or 'wild imagination', is required for
creating a clarifying environment. It has also been shown that
creativity is correlated with the ability to arouse new
associations, detached from the trigger stimulus (a 'chain'
pattern) (Levin, 1973). Nevertheless, we must admit that
summative evaluation is not the most appropriate situation for
assessing the creative imagination, unless this is an explicit
objective of the evaluation.



The content areas of HOT are well-nigh unlimited and depend
entirely on the available resource people. The test can be easily
applied to any phase in the course of studies (indeed it may be
applied long before the university level). A case of life in
space, where proteins do not exist (a sort of "Andromeda seed"),
is one extreme example, derived from cellular rather than organ
biology. A solution for a non-existing inborn error of protein
metabolism is another example, derived from the same content area.
As the findings support the assumption of but a loose content
dependency, the actual problem presented is of secondary
importance. Alterations are also possible in the entire scoring
system, including the assessment of originality. It may also be
useful to further develop mechanized scoring. Thus the HOT
represents an example of the idea of detachment from reality in
order to measure intermediate stages of problem-solving, rather
than a structured instrument.

TABLE 1

Examples of 'Building Blocks'
(numbered according to original sequence in 1979 examination).

1. The animal's body temperature:

   a. Varies in accordance with environmental conditions.
   b. Constant at values of 36-38°C.
   c. Constant at values of between 20-30°C at night
      and 36-38°C during the day.
   d. Constant at values of between 35-40°C at night
      and 10-20°C during the day.

7. The ionic composition of the extracellular fluid of the
   animal in relation to the environment is:

   a. isotonic
   b. hypotonic
   c. hypertonic
   d. varies with food and liquids absorbed

8. The animal's muscles are:

   a. a large mass relative to body weight
   b. a small mass, most of which are smooth, and a minority
      striated
   c. a large mass, most of which are trunk muscles, and
      a minority limb muscles
   d. a large mass, mostly in the limbs

22. The animal's major mechanism for reaction speed is:

   a. decreasing cortical inhibition of reflexes
   b. increasing cortical control of reflexes
   c. increasing sensitivity to peripheral sensory stimuli
   d. the motor system is under sub-cortical control
      (extra-pyramidal)

27. The mechanism of regulation of the animal's heart rate is:

   a. self-regulation by means of a pacemaker in the heart
      (or in each of the hearts) without central control
   b. central regulation without pacemaker(s)
   c. regulation of the flow by change in peripheral
      resistance without pacemaker(s)
   d. the animal has no heart at all

TABLE 2

Additional Summarized 'Building Blocks'

Building block               Summarized examples of the available
                             options

* Weight and metabolic rate  Several combinations of weight and
                             O2 consumption.
* Breathing apparatus        Several combinations of rate of gas
                             exchange and depth of cavities.
* Respiratory pigments       Several metals with different affinity
                             to oxygen.
* Movement                   Alertness and activity; sleeping
                             habits; posture; locomotion.
* Intestine                  Number and length of segments;
                             pH in each.
* Alimentary enzymes         Several combinations of enzymes.
* Temperature regulation     Constituents of integument.

* One of the options is that the structure under discussion
  does not exist at all in the organism.

TABLE 3

Multitrait-multimethod correlation matrix
of early vs. clinical evaluations (N = 66)

                               Integrative Examination      Clinical Evaluation
                               (year 1)                     (years 4-6)

                               Problem-  Knowl.   Interp.   Problem-  Knowl.   Interp.
                               Solving   of cont. skills    Solving   of cont. skills
                               HOT       MCQ*     COMMUN.   PMP       MCQ***   RATINGS****
                                                  TEST**

Integrative Examination (year 1)
  Problem-Solving (HOT)
  Knowledge of content (MCQ)     .06       .78
  Interpersonal skills (COMM.)   .23       .00

Clinical Evaluation (years 4-6)
  Problem-Solving (PMP)          .33+      .22      .04       .21
  Knowledge of content (MCQ)     .05       .1»6+   -.11       .34       .43
  Interpersonal skills (RATINGS) -.03      .10      .13+      .46       .31      .68


* Over other subtests of the same examination relating to the same content area as HOT
** Another subtest of the Integrative Examination, measuring communication skills (written test)
*** Over End-of-Clerkships MCQ tests
**** Faculty ratings on a checklist specifying behaviors




TABLE 4

Correlations Between HOT and both
Student's Achievements and Admission Criteria

Source                                  No. of      Correlation
                                        Students*   Coefficient**

Admission criteria:
  Intelligence                            192          .07
  Interviews score                        192          .00

Achievements:
  Scores in other subtests of same
    examination (1st year)                192          .06
  Mean scores on PMPs in pediatrics,
    ob-gyn and primary care
    (4th, 5th and 6th years)               66          .26+
  Score on PMPs, medicine, final
    (6th year)                             30          .37+
  Score on PMP of a rare case,
    primary care (5th year)                30          .43++
  Assessment by clinical instructors
    over clerkships
    (4th, 5th and 6th years)               66         -.20

* The different N's represent the number of classes which
  reached each phase.
** Pearson's product moment correlation coefficient
+ P < .05
++ P < .01



